Abstract


This  thesis  explores  how  to  represent  image  texture  in  order  to  obtain  information  about  the 
geometry  and  structure  of  surfaces,  with  particular  emphasis  on  locating  surface  discontinuities. 
Theoretical  and  psychophysical  results  lead  to  the  following  conclusions  for  the  representation  of  image 
texture: 


(1) A texture edge primitive is needed to identify texture change contours, which are formed
by an abrupt change in the 2-D organization of similar items in an image. The texture edge
can be used for locating discontinuities in surface structure and surface geometry and for
establishing motion correspondence.

(2) Abrupt changes in attributes that vary with changing surface geometry -- orientation,
density, length, and width -- should be used to identify discontinuities in surface geometry and
surface structure.

(3)  Texture  tokens  are  needed  to  separate  the  effects  of  different  physical  processes  operating 
on  a  surface.  They  represent  the  local  structure  of  the  image  texture.  Their  spatial  variation 
can  be  used  in  the  detection  of  texture  discontinuities  and  texture  gradients,  and  their 
temporal  variation  may  be  used  for  establishing  motion  correspondence.  What  precisely 
constitutes  the  texture  tokens  is  unknown;  it  appears,  however,  that  the  intensity  changes 
alone  will  not  suffice,  but  local  groupings  of  them  may. 

(4) The above primitives need to be assigned rapidly over a large range in an image.

1. Introduction

This  paper  explores  how  to  represent  image  texture  in  order  to  extract  information  about  the 
physical  surfaces.  Recent  work  by  Marr  [1977]  suggests  that  the  description  of  viewed  surfaces  plays  a 
fundamental  role  in  early  visual  processing  and  that  determining  the  form  of  the  descriptions  given  to  the 
image  and  to  the  viewed  surfaces  should  be  one  of  the  first  steps  taken  toward  understanding  early  visual 
processing. This paper analyzes texture in terms of these surface considerations and this representational
viewpoint, investigating what aspects of texture should be made explicit in an image to obtain
information about the geometry and structure of surfaces, with particular emphasis on locating surface
discontinuities. This sets the present study of texture apart from many others, which emphasize texture
discrimination, a task that probably serves different goals.

In  this  introduction,  we  shall  first  expand  on  the  aforementioned  role  of  surfaces  and 
representations  in  early  visual  processing,  and  on  the  use  of  texture  to  obtain  surface  information.  Some 
methodological  issues  will  then  be  discussed  that  reflect  on  the  current  level  of  understanding  about  the 
representation  of  texture. 

The  role  of  surfaces  in  visual  processing 

The visual world is composed mostly of surfaces. An image can thus be attributed to four
physical factors: the surface geometry (how the surfaces lie in space), the surface reflectance, the
illumination, and the viewpoint [Horn 1977]. For a sequence of images separated in time an additional
attributing factor is needed: the surface correspondence between successive images (which will be
non-trivial if the surfaces are in motion relative to the viewer). It would be of great value if these factors
could be determined from an image or sequence of images, since this would provide information directly
about the physical world that is present only indirectly in their combination in an image. The human visual
processor's facility at finding the shape and arrangement of visual surfaces, their lightness and color, and the
location of discontinuities in surface orientation, depth, and reflectance indicates that this information can
indeed be determined to a considerable degree. But how is it done?

Using  image  texture  to  infer  surface  information 

The major sources of information about visual surfaces in an image include shading, stereo,
motion, texture gradients and edges. The first several make direct use of the intensity changes present in
an image. Shading obviously does so. Marr & Poggio [1978] have shown that the intensity changes
present at several scales (the zero-crossings) are effective correspondence tokens for stereo matching.
These intensity changes can also be used to obtain directionally sensitive motion information [Marr &
Ullman 1981]. The intensity changes in an image thus seem to provide sufficient constraint to exploit
these sources, and an understanding of the intensity change description was evidently crucial to the
success so far [Marr & Poggio 1978, Marr & Hildreth 1980].

A precise understanding of how to distinguish among discontinuities in surface orientation,
depth, reflectance, and illumination, of how to find motion correspondence over a large range in an
image, and of how to obtain surface orientation and depth from texture gradients has proved more
elusive. In part, this may be because the intensity changes in an image alone do not provide sufficient
constraint to solve these problems easily, and other aspects of the 2-D information in an image, such
as texture, must also be made explicit and used. Let us briefly examine, in turn, each of these latter
sources of surface information.

The location of a discontinuity in surface orientation, depth, reflectance, or illumination in an
image often coincides with an intensity edge. But can the physical type of discontinuity (e.g. depth
change, orientation change, illumination change) be determined from the intensities directly? By looking
at the intensity gradient at an edge, Ullman's light source detection operator can, in principle, distinguish
a pure reflectance change from other discontinuity types (e.g. illumination change) [Ullman 1976]. By
examining the edge profiles, other edge parsings may be possible [Horn 1977]. It is not presently known
how well edges can be parsed into their physical correlates directly from intensity information in real
images. A discontinuity in image texture originates at a discontinuity in surface structure or in surface
geometry, and can therefore be used to locate these two kinds of physical discontinuity. The location of
surface discontinuities provides information that is useful, for instance, to processes that must decide
where smooth surface assumptions are no longer valid, as in the interpolation of a surface across points
derived from stereo matching. Considerable emphasis will be given to locating surface discontinuities in
this paper.

Motion correspondence across several degrees of visual angle in successive images (at which
humans are quite adept -- the well-known apparent motion effect) is a considerably more difficult problem
than stereo correspondence, since it involves increased range, unknown direction of motion, and the
possibility of surface transformation over time. Given the profusion of intensity changes present in a real
image, motion correspondence driven solely by the intensity changes results in many candidate matches
for each motion token (e.g. edge fragment). Ullman [1979] approached this problem by assigning a
likelihood to each possible match between images, assuming nearby matches were more likely, and
computing the maximum likelihood solution for that pair of images. An alternate approach would be to
use larger scale tokens such as texture discontinuities and collinear groupings, which should have fewer
candidate matches over a given range than the raw intensity changes, to bring the longer range motions
into correspondence. Ullman noted that tokens that were more abstract than the raw intensity changes
could be used to establish motion correspondence in humans, and called them group tokens.
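
The likelihood-based matching idea can be illustrated with a toy sketch. It is not Ullman's actual formulation: the Gaussian fall-off of likelihood with distance, the one-to-one matching restriction, the brute-force search, and the names match_likelihood and best_correspondence are all assumptions made here for illustration.

```python
import itertools
import math

def match_likelihood(p, q, sigma=1.0):
    """Hypothetical likelihood of matching token p to token q.
    Assumption: likelihood falls off as a Gaussian of image distance."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def best_correspondence(frame1, frame2, sigma=1.0):
    """Return the one-to-one matching (a tuple of indices into frame2)
    that maximizes the product of pairwise likelihoods.
    Brute force over permutations, so only usable for a few tokens."""
    assert len(frame1) == len(frame2)
    best, best_score = None, -1.0
    for perm in itertools.permutations(range(len(frame2))):
        score = 1.0
        for i, j in enumerate(perm):
            score *= match_likelihood(frame1[i], frame2[j], sigma)
        if score > best_score:
            best, best_score = perm, score
    return best

# Three tokens in the first image; the same tokens slightly displaced
# and listed in a different order in the second image.
frame1 = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
frame2 = [(5.5, 0.5), (0.5, 5.0), (0.5, 0.0)]
matching = best_correspondence(frame1, frame2, sigma=2.0)
print(matching)  # (2, 0, 1)
```

The number of candidate matchings grows factorially with the number of tokens, which is one way to see why tokens with fewer candidate matches over a given range would be attractive for long-range correspondence.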

Determining surface depth from texture gradients requires extracting a measure that shows no
foreshortening in an image: this is necessary to factor out the effects of changing surface orientation from
those due to perspective [Stevens 1981a]. In Figure 1.1, surface depth cannot be obtained from the height
of the ellipses since this measure is parallel to the texture gradient and will vary both with surface orientation
and depth. Thus, this distribution of heights could be due to either a cylinder (changing height due
mostly to changing surface orientation) or a receding plane (changing height due entirely to changing
depth). However, if the width of the ellipses is used and provided that the ellipses are congruent across
the surface, then surface depth can be obtained, since this measure is perpendicular to the texture
gradient and will not show foreshortening. Thus, the variation in ellipse widths will be due entirely to
changing depth. Stevens' method for finding this measure with no (or least) foreshortening essentially
assumes that a description of image texture is available. In particular, such information as the position
and dimensions of small blobs in an image would be useful, while the location of the intensity changes
alone is probably too primitive a description of an image from which to extract an unforeshortened
measure directly.
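
As a minimal sketch of the width argument, assume the ellipses are congruent on the surface and that image width is inversely proportional to distance, as stated for Figure 1.1. The function name relative_depths and the sample widths are hypothetical.

```python
def relative_depths(widths):
    """Depth of each ellipse relative to the widest (nearest) one.
    Assumes congruent surface items, so that the unforeshortened
    image width is inversely proportional to distance from the viewer."""
    if not widths or min(widths) <= 0:
        raise ValueError("widths must be a non-empty list of positive numbers")
    nearest = max(widths)
    return [nearest / w for w in widths]

# Hypothetical ellipse widths measured perpendicular to the texture gradient.
widths = [8.0, 4.0, 2.0]
depths = relative_depths(widths)
print(depths)  # [1.0, 2.0, 4.0]
```

Only relative depth is recovered here; absolute depth would need the true size of the surface items, which the image alone does not supply.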

In summary, distinguishing among discontinuities in surface orientation, depth, reflectance, and
illumination, finding long-range motion correspondence, and obtaining surface orientation and depth
from texture gradients may prove difficult if only the intensity changes are examined directly, while if the
information in image texture is used, these problems may prove tractable. This makes it imperative to
understand what aspects of image texture should be identified in an image. Without knowing what
relevant data will be available, it is impossible to precisely define, say, a motion correspondence process
or a depth from texture process; the best that can be determined are these processes' abstract
computational needs. Thus, we could say that a motion correspondence process requires image tokens
that remain in correspondence with the same physical feature in successive views and for which there are
typically a small number of possible matches over the desired range. For depth from texture gradients, an
unforeshortened measure in the image is needed. But to be much more specific requires knowing the
form of the input data, in particular, knowing what aspects of image texture to detect in an image and
how they should be represented in the visual system.

Figure 1.1 Surface depth cannot be obtained from the height of the ellipses, since this measure is parallel
to the texture gradient and will vary both with surface orientation and depth. Surface depth can be
obtained from the width of the ellipses, however, since this measure is perpendicular to the texture
gradient and will not show foreshortening. Provided the ellipses are congruent across the surface, their
width will be inversely proportional to their distance from the viewer. [Figure courtesy of K. Stevens]

Representational  Emphasis 

We seek to determine the early visual representation of image texture, since the form of the
description of image texture must be specified before its computation can be specified. If the broad goals
of the computation are not well understood, but instead some image computation is defined prematurely,
the results are likely to be of little value in the long term to the theory of vision. This representation's
primitives -- the basic assertions that can be made about image texture -- need to be specified, in
particular. Other important representational issues to be determined include the range and resolution
over which these primitives can be assigned in an image, and the referencing system for retrieving these
primitives (see Marr and Nishihara [1978] for a discussion of visual representations). Marr [1976] has
called the early representation of the intensity changes and 2-D geometric structure in an image the (full)
primal sketch (the raw primal sketch represents just the intensity changes).

The primal sketch is the first of several representations that Marr [1977] sees as having a central
role in the computational theory of vision. The primal sketch is used to construct the 2½-D sketch, a
viewer-centered representation of the visible surfaces in a scene. It is in the 2½-D sketch that the various
factors that produce an image are separated -- the surface geometry, surface reflectance, the illumination,
and the viewpoint. Many processes that provide surface information from images, such as depth from
texture, can be viewed as reading from the primal sketch and writing to the 2½-D sketch.

The term early texture representation is used to indicate that we are interested here in the
description of texture that is produced early in the visual processing, and is used for extracting global
surface information (the creation of the 2½-D sketch), and not a much richer description produced by
local scrutiny that we might expect exists for the purposes of recognition, and is much more limited in
speed and image range than the early texture representation.

Informal  definition  of  image  texture  must  precede  its  precise  computational  definition 

It is inevitable that the definition of image texture will be imprecise initially; we have to rely
upon an intuitive definition. This has been the case with other aspects of visual processing. An intensity
edge, for instance, is informally defined as a place in an image where the intensity changes abruptly, with
a surface correlate of a discontinuity in surface orientation, depth, reflectance, or illumination. Recently,
Marr & Hildreth [1980] have formally defined an edge in terms of the spatial coincidence of intensity
changes at two nearby scales found by a convolution operation that will be described later. Their method
defines a precise computation on an image for detecting edges. The informal definition, however, existed
first, specifying roughly what is to be represented, and what significance it has with respect to physical
surfaces. The formal definition then specifies how it is to be detected from an image. The idea of
detecting abrupt intensity changes is very intuitive and was an important precursor to determining their
precise computation. The aspects of image texture that should be detected are not as intuitively obvious.
Thus, we must begin by understanding roughly what aspects of image texture should be represented in an
image and what their physical correlates are. Once we have approximate definitions of what we want, we
can then examine exactly how to compute them from an image. Such informal definitions can also be
used to test for their psychophysical existence.
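
To make the step from the informal to the formal edge definition concrete, here is a one-dimensional toy sketch of "coincidence of intensity changes at two nearby scales". It is not the Marr & Hildreth computation itself: the 1-D second-derivative-of-Gaussian filter, the two sigma values, the border handling, the tolerance, and all function names are assumptions chosen for illustration.

```python
import math

def d2_gaussian_kernel(sigma):
    """Sampled second derivative of a Gaussian, shifted so the taps sum to zero
    (so that flat regions give a response of approximately zero)."""
    radius = int(4 * sigma)
    taps = []
    for k in range(-radius, radius + 1):
        g = math.exp(-k * k / (2.0 * sigma ** 2))
        taps.append((k * k / sigma ** 4 - 1.0 / sigma ** 2) * g)
    mean = sum(taps) / len(taps)
    return [t - mean for t in taps]

def convolve_clamped(signal, kernel):
    """Filter a 1-D signal, replicating the border samples."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i + k, 0), n - 1)
            acc += kernel[k + radius] * signal[j]
        out.append(acc)
    return out

def zero_crossings(response, eps=1e-9):
    """Indices i where the response changes sign between i and i + 1,
    ignoring values that are zero up to rounding error."""
    return [i for i in range(len(response) - 1)
            if abs(response[i]) > eps and abs(response[i + 1]) > eps
            and response[i] * response[i + 1] < 0]

def coincident_edges(signal, sigma_a=2.0, sigma_b=3.0, tolerance=1):
    """Zero-crossings at scale sigma_a that have a zero-crossing at
    scale sigma_b within `tolerance` samples."""
    za = zero_crossings(convolve_clamped(signal, d2_gaussian_kernel(sigma_a)))
    zb = zero_crossings(convolve_clamped(signal, d2_gaussian_kernel(sigma_b)))
    return [i for i in za if any(abs(i - j) <= tolerance for j in zb)]

# A step in intensity between samples 19 and 20.
signal = [0.0] * 20 + [1.0] * 20
edges = coincident_edges(signal)
print(edges)  # [19]
```

The informal definition ("a place where the intensity changes abruptly") says what to look for; a procedure of this kind is one way of saying how, and a comparable pair of definitions is what is being sought for texture.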

This paper is divided into two parts. Part I develops the theory of the representation of texture,
and comprises Sections 2 through 5. In Section 2, physical constraints on surface structure are
formulated. In Sections 3 and 4, two kinds of image texture primitives, the texture edge and the texture
tokens respectively, are introduced along with the rationale for their utility to the visual system. Section 5
summarizes Part I.

Part II of this paper is devoted to demonstrations of the human visual system's early
representation of texture, serving as a check on the utility of these primitives to a successful visual
processor. Section 6 describes demonstrations supporting the existence of a texture edge primitive in this
representation, and Section 7 describes demonstrations that restrict the range of what constitutes the
texture tokens in this representation. Section 8 summarizes Part II.

2. Physical Constraints on Surface Structure

An image is a two-dimensional projection of the three-dimensional world. An important goal of
early visual processing is, in a sense, to invert this mapping. If the point in space corresponding to each
image point could have arbitrary position and brightness, this task would be impossible. Our abilities to
perceive the 3-D world visually indicate, of course, that this is not the case. The visual world must be
otherwise constrained. These physical constraints on the visible world and on the projected image must
be identified in order to understand how to infer backward from an image. Three physical constraints
will be identified that are relevant to surface structure. These constraints in their original form are due to
Marr [1981].

The  predominance  of  surfaces 

In  the  introduction,  the  visible  world  was  considered  composed  mostly  of  surfaces  that  are 
smooth  enough  that  their  local  surface  orientation  could  be  discussed.  For  instance,  a  leaf  defines  such  a 
smooth  surface.  A  hedge  containing  this  leaf  will  itself  define  a  smooth  surface  when  viewed  from 
sufficiently  far  away.  Even  at  distances  where  its  leaves  can  be  resolved  but  the  variation  in  the  distance 
to  them  is  small  relative  to  their  absolute  distance  from  the  viewer,  the  hedge  can  be  considered  an 
approximately  smooth  surface.  Thus,  only  in  a  physical  situation  such  as  a  snowstorm  would  suitable 
surfaces  be  hard  to  define. 

A leaf's reflectance function would be fairly constant over its surface if it were uniformly
pigmented. For a hedge, however, its composite structure and the effects of mutual illumination and
occlusion would make the spatial variation of its reflectance function very complex. This illustrates our
first constraint: the visible world can be regarded as being composed of smooth surfaces having reflectance
functions whose spatial variation may be complex.

There are two consequences of this constraint in an image. First, image points typically
originate from surface points. Second, it may be very difficult to determine analytically the geometry of a
surface such as a hedge from the intensity values directly (i.e. by treating it as a shading problem) even if
the location of the light sources is known, because of the complex nature of its reflectance function.

While an analytic statement of the spatial variation of the hedge's reflectance function may be
complex, defining its spatial structure with respect to items that constitute it could be less so. The leaves
that form the hedge's surface may be of uniform size and density. The leaves themselves may have
markings with their own characteristic attributes. Explicit descriptions of each of these kinds of surface
item present in the hedge will capture information that is otherwise buried in its analytic reflectance
function. Two additional constraints formalize this notion.

Different  processes  form  different  kinds  of  surface  items 

A leaf and a leaf marking are different not only to our senses; they are also intrinsically different
in terms of their physical nature and origin. In order to formalize this intuitively simple idea, we can
think of leaves as being generated by some physical process operating on a surface at a given scale, while
leaf markings are generated by some different processes operating at a smaller scale. This provides the
second constraint: physically different processes operate on a surface to form different kinds of items there.
One set of processes operating at a given scale, thus, determines the size and shape of the leaves in a
hedge. Another forms the markings on those leaves. One set of processes determines the spatial
arrangement of the hairs on an animal's coat. Others form the spots and markings on that coat. This
constraint is important because it permits a physical distinction to be made between those aspects of
surface structure that are essentially the same kinds of items (such as two leaves in a hedge), being due to
the same physical processes, and those that are different kinds of items (such as a leaf and a leaf
marking, or a leaf and a brick), being due to very different processes.

Items generated by the same processes are similar

The third constraint is: surface items generated by the same physical processes tend to be more
similar to one another in their size, shape, lightness, color, and spatial arrangement than to surface items
generated by other processes. This states that, with respect to these attributes, a leaf is more likely to be
similar to another leaf than, say, to a brick.

In an image, the projection of the surface items generated by the same processes will tend to be
more similar to one another in size, shape, contrast, color, orientation, and spacing, than to the projection
of other surface items that are generated by different processes. Note, however, that the similarity may be
preserved only locally in an image. Changing surface geometry and perspective projection can destroy
global similarity since size, contrast, orientation, and spacing can all vary with changing surface geometry.

3. The Texture Edge

As  stated  in  the  introduction,  an  important  goal  of  early  visual  processing  is  determining  the 
different  physical  factors  that  produce  an  image.  In  particular,  this  involves  decoupling  surface 
orientation,  depth,  and  the  location  of  discontinuities  in  these  from  surface  reflectance  and  illumination. 
In  this  section,  we  shall  focus  on  surface  discontinuities.  We  shall  see  that  one  consequence  of  the 
previous  section's  constraints  is  that  abrupt  changes  in  texture  in  an  image  can  be  used  to  identify 
discontinuities  in  surface  geometry  and  surface  structure. 

The  location  of  surface  discontinuities  is  not  explicit  in  the  intensity  changes 

The locations of discontinuities in surface structure or surface geometry are not yet explicit in the
intensity changes. There may be a myriad of contours present in the intensity changes, only a few of
which coincide with a discontinuity in surface geometry or surface structure. Others will be due to the
internal structure of a surface or to shadows and highlights. For example, in Figure 3.1 the bottom-most
horizontal line, which coincides with the texture boundary, may indeed be present in the intensity
changes, but nothing there distinguishes it from the other horizontal lines, also present in the intensity
changes, as the location of a texture change in the image, and thus the likely location of changing surface
structure or surface geometry (e.g. a brick wall abutting a grass lawn). There may even be no significant
intensity change coinciding with the image of a surface discontinuity, while contours defined by the
image structure may still be present there. It is the image structure contours that hold the key to
identifying discontinuities in surface geometry and surface structure.

Figure 3.1 There are many contours in this figure that are explicit in the intensity changes; for instance,
the bottom-most horizontal line at the texture boundary is present there. Nevertheless, this line has not
yet been distinguished from the other horizontal lines, which are also present in the intensity changes, as
the location of a texture discontinuity in the image. Locating such abrupt texture changes in an image is
important, since they identify the likely location of discontinuities in surface structure or surface
geometry.

Two types of image structure contours

Not every contour in an image is defined solely by intensity changes coincident with the
contour. A contour can also be defined by image structure, and in at least two different ways. One kind
can be created by an abrupt change in some 2-D organization in an image. In Figure 3.2, for example, the
45° change in the orientation of the line segments defines a contour that corresponds to the boundary
between the two oriented regions. A sudden change in local density of the line segments in this figure
also defines such a contour, which corresponds to the external boundary of the two regions, with the line
segment density vanishing outside these regions. We shall refer to such contours as texture change
contours. A second kind of contour can be defined by the local alignment of various image features. For
example, the local alignment of the terminations of the lines in Figure 3.3 defines such a contour. We
shall call these alignment contours.
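
A texture change contour of the first kind can be illustrated with a toy one-dimensional sketch: given one attribute value (orientation here, but density would serve equally) for each item along a line across the image, look for the place where the local average changes abruptly. The windowed comparison, the threshold, the neglect of orientation wrap-around at 180°, and the function name are all assumptions made for illustration, not a method proposed in the text.

```python
def texture_change_positions(values, window=3, threshold=20.0):
    """Return indices i where the mean of `values` over the `window` items
    before i differs from the mean over the `window` items starting at i
    by more than `threshold`, keeping only local maxima of that difference.
    Assumption: values are plain numbers (no angular wrap-around handling)."""
    n = len(values)
    diffs = {}
    for i in range(window, n - window + 1):
        left = sum(values[i - window:i]) / window
        right = sum(values[i:i + window]) / window
        diffs[i] = abs(right - left)
    return [i for i, d in diffs.items()
            if d > threshold
            and d >= diffs.get(i - 1, 0.0)
            and d > diffs.get(i + 1, 0.0)]

# Orientations (in degrees) of line segments along a scan across two regions,
# the second rotated 45 degrees relative to the first.
orientations = [0.0] * 6 + [45.0] * 6
boundaries = texture_change_positions(orientations)
print(boundaries)  # [6]
```

Note that nothing in this sketch uses intensity at the boundary itself; the contour is defined only by the change in an attribute of similar items on either side.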

We  explore  texture  change  contours  and  their  use  in  identifying  discontinuities  in  surface 
geometry  and  surface  structure  in  this  section.  Alignment  contours  will,  for  the  most  part,  not  be  treated 
in  this  paper.  Let  us  examine  next  the  relationship  between  texture  change  contours  and  surface 
discontinuities. 

Discontinuities due solely to changing surface geometry

First, consider a discontinuity in surface geometry where the surface reflectance function is
constant across the discontinuity. Examples of this are two surface fragments that are adjacent in an
image and have the same surface structure and coloration but have different surface orientation, depth, or
rotation. For instance, Figure 3.2 could be the image of a creased surface as shown in Figure 3.4a or,
instead, it could be the image of two surfaces, one rotated 45° with respect to the other as shown in
Figure 3.4b. Figure 3.5 could be the image of two similarly textured surfaces differing in depth (one √2
farther away than the other), or again it could be a creased surface (with, say, one side parallel to the
image plane and the other side at a 60° slant).
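
A short calculation suggests why these two readings of Figure 3.5 are interchangeable as far as dot density is concerned. It rests on assumptions not stated in the text: each region is a small planar patch with uniform surface density $\rho_0$, the depth ratio is taken to be $\sqrt{2}$, image lengths scale as $1/d$ with distance $d$, and a slant $\sigma$ compresses image area by $\cos\sigma$.

```latex
% Image density (items per unit image area) of a small patch at
% distance d and slant \sigma, up to a constant factor:
\rho_{\mathrm{image}}(d,\sigma) \;\propto\; \rho_0 \,\frac{d^{2}}{\cos\sigma}

% Depth interpretation: same slant, one patch \sqrt{2} times farther away
\frac{\rho_{\mathrm{image}}(\sqrt{2}\,d,\,0^\circ)}{\rho_{\mathrm{image}}(d,\,0^\circ)}
  \;=\; \frac{(\sqrt{2}\,d)^{2}}{d^{2}} \;=\; 2

% Crease interpretation: same depth, one patch slanted by 60 degrees
\frac{\rho_{\mathrm{image}}(d,\,60^\circ)}{\rho_{\mathrm{image}}(d,\,0^\circ)}
  \;=\; \frac{1}{\cos 60^\circ} \;=\; 2
```

Under these assumptions both interpretations predict a doubling of image density across the boundary. The two cases would still differ in other attributes: a slant compresses spacing along one image direction only, whereas a depth change scales it in all directions.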

Figure 3.4 Two of several possible physical origins for Figure 3.2: (a) a creased surface, and (b) a surface
rotated relative to another surface with the same surface structure.

From the constraints of the previous section, the image of a local patch of a structured surface
where the surface geometry does not change much will likely contain, at particular scales, items that are
similar to one another in orientation, spacing, color, contrast, size, and shape. But where the surface
geometry changes, geometric attributes such as orientation, density, and length of the image of the surface
items will change. (Intensity, contrast, and color can also vary with changing surface geometry, although
large contrast and color changes are unlikely since these would require perverse illumination or
reflectance functions.) Thus, at a discontinuity due solely to changing surface geometry, there will often
be an abrupt change in these geometric attributes of the image of similar surface items, forming a texture
change contour.

Discontinuities  due  to  changing  surface  structure 

There is another physical source of texture change contours in an image, and this represents the
other basic type of surface discontinuity -- one due to changing surface structure. For instance, Figure 3.5
could be the image of two adjacent surfaces lying in the same plane that have different dot densities.
When surface structure changes, the similarity constraint of Section 2 indicates that items at given scales
on one surface will likely be more similar to one another in orientation, color, contrast, size, and shape
than to items on the other surface, resulting in abrupt changes in the items at each scale at the image
location of the surface discontinuity, and giving rise to a texture change contour. In this case, however,
any surface attribute can change, not just geometric attributes; the surface structure can change arbitrarily
across this kind of surface discontinuity.

Texture change contours need to be made explicit

We have seen above that a texture change contour can be formed by a discontinuity in surface
geometry or surface structure. A texture change contour can be due finally to some combination of these
factors. Thus, a texture change contour identifies the likely location of a surface discontinuity of some
form. This alone makes the representation of texture change contours valuable since, as we saw above,
the location of surface discontinuities may not be present explicitly in the intensity changes. This
represents the first major implication for the early texture representation: texture change contours should
be made explicit in the image since they identify the likely location of discontinuities in surface geometry or
surface structure, information that may not be explicit in the intensity changes alone.

Separating  the  physical  factors  that  produce  texture  change  contours 

Is  it  possible  from  an  image  to  distinguish  among  those  texture  change  contours  due  solely  to 
changing  surface  geometry,  those  due  solely  to  changing  surface  structure,  and  those  due  to  some 
combination  of  these  two  factors?  Unfortunately,  the  answer  is  that  this  cannot  always  be  achieved  from 
image  texture  information  alone.  When  the  surface  structure  changes  completely,  forming  a  texture 
change  contour,  there  is  no  information  in  the  image  texture  about  whether  the  surface  geometry  changes 
there also. A structural change can also mimic a geometric change as, for example, when Figure 3.5 is due
to a change in surface dot density, and not to a change in depth. However, it is possible to distinguish
between those texture change contours that could be due solely to change in surface geometry, and those
that must involve some surface structure change. The former contain only geometric changes in the image
of the surface items across the texture change contour: it would be possible with suitable 3-D
configurations of two surfaces having the same surface structure to project in the image as each of these
texture changes. The latter contain non-geometric changes, as in Figure 3.6. No change in surface
geometry can cause the squares in this figure to be transformed into dots having the same density as the
squares. Instead, the surface structure must have changed. At the end of this section, we shall explore
how to distinguish between geometric and non-geometric texture changes.

The  texture  edge  primitive  and  its  uses 

The representation of an intensity change contour begins with intensity edge and bar primitives,
which are local assertions assigned at many points along the contour that make explicit the position, local
orientation, contrast, and width [Marr 1976, Marr & Hildreth 1980]. Analogous to this, points along a
texture change contour in an image can be represented in our early texture representation by a texture
edge primitive, which makes explicit local contour position and orientation at the very least.

We  have  already  seen  above  that  the  representation  of  texture  change  contours  is  important  for 
detecting  surface  discontinuities  and  can  be  used  to  distinguish  between  those  discontinuities  that 
possibly  could  be  due  solely  to  a  change  in  surface  geometry  and  those  that  cannot.  In  addition  to  this, 
the  texture  edge  primitive  could  be  useful  for  establishing  motion  correspondence.  Given  the  many 
possible  candidate  matches  of  edge  and  bar  fragments  for  motion  correspondence  over  several  degrees  of 
visual  angle,  the  larger  scale  and  rarer  texture  edges  give  fewer  possible  matches  over  a  given  range. 
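
The argument about candidate matches can be illustrated with a small counting sketch. The token
positions, the search radius, and the function name candidate_matches are assumptions made purely for
illustration; they are not part of the theory or of any particular correspondence algorithm.

```python
import math

def candidate_matches(token, next_frame_tokens, max_displacement):
    """Return the tokens in the next frame lying within max_displacement of token."""
    x, y = token
    return [(u, v) for (u, v) in next_frame_tokens
            if math.hypot(u - x, v - y) <= max_displacement]

# Hypothetical second frame: dense edge fragments on a 1-unit grid,
# and sparse texture edge tokens on a 10-unit grid.
edge_fragments = [(x, y) for x in range(40) for y in range(40)]
texture_edges = [(x, y) for x in range(0, 40, 10) for y in range(0, 40, 10)]

start = (20, 20)      # position of a token in the first frame
radius = 6.0          # assumed maximum displacement between frames

n_fragment_candidates = len(candidate_matches(start, edge_fragments, radius))
n_texture_candidates = len(candidate_matches(start, texture_edges, radius))
print(n_fragment_candidates, n_texture_candidates)
```

Over the same search range, the sparse tokens leave far fewer candidates to disambiguate than the
dense fragments, which is the sense in which rarer primitives ease the correspondence problem.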

Range  of  the  representation 

An  issue  of  particular  importance  is  the  range  in  an  image  over  which  this  texture  edge  primitive 
can  be  assigned,  since  this  determines,  in  part,  the  computational  burden  of  forming  the  early  texture 
representation.  One  extreme  of  this  range  would  be  a  representation  that  encompasses  only  a  very  small 
portion of an image (e.g. the fovea) at one time, or that allows only a very few primitives to be assigned at
one  time.  At  the  other  extreme  would  be  a  representation  that  encompasses  the  entire  image  and  can 
allow  as  many  primitive  assignments  as  image  resolution  permits.  While  it  is  difficult  at  this  point  to  say 
precisely  where  in  this  range  our  early  representation  of  texture  should  lie,  it  can  be  said  that  it  must  lie 
closer  to  a  full  image  range  representation  than  to  a  very  restricted  but  economical  one  that  can  represent 
only a small fraction of the texture edges found in an image. Very limited range or resolution may
be appropriate for some visual representations, but such limitations are undesirable for the early
representation  of  image  texture  considering  the  uses  to  which  this  representation  will  be  put. 

As previously outlined, the full primal sketch, which represents both the intensity changes and
image structure, serves as the basic description of an image from which the 2½-D sketch, a
viewer-centered representation of the viewed surfaces in space, is formed. In this framework, the early
texture representation is considered a part of the full primal sketch. Further, the formation of the 2½-D
sketch's description of the viewed surfaces -- their orientation, depth, reflectance, location of
discontinuities -- is a fundamental goal of early visual processing. If, as has been argued above, the
texture edge primitive makes explicit aspects of image structure that are useful for creating a
representation of surfaces present throughout an image, then it follows that texture edges must be
detected rapidly throughout the image. This is an expensive step, since it requires that considerable
computational resources be brought to bear if an entire image is to be processed in a fraction of a second.
Next, as texture edges are detected throughout an image, they need to be stored away somewhere, and the
most direct way to do this is in a representational memory encompassing the entire image. This is
particularly important for establishing large range motion correspondence using texture edges, since there
is a wide image range over which a particular token could move. This approach may seem
computationally expensive compared to the use of a scrutinizing processor for local analysis of surface
structure that is directed more leisurely across the image. But such a local scrutinizing processor would
be inherently too slow to rapidly cover large portions of an image and feed as input to the 2½-D sketch.

Detecting  texture  edges 

Conceptually, the detection of texture edges can be divided into two major steps. First, the basic
structural elements that will be used to represent the image texture locally must be made explicit. We
shall call these primitive elements the texture tokens. Second, the spatial variation of these tokens is
used to locate texture edges. It is not presently known what constitutes the texture tokens; this could
conceivably range from grey-level values to primitives that represent individual texture elements and
their attributes such as orientation, length, width, contrast, shape, and color (e.g. each line segment in
Figure 3.2). In Section 4, we shall see that the range in which the texture tokens lie can be restricted, but
their precise form has yet to be resolved. Until it is, it will be difficult to say much about methods for
detecting texture edges.

One  issue  that  can  be  discussed  at  this  time,  however,  is  the  desirable  dimensions  for  the  texture 
token  attributes.  We  saw  above  that  at  a  discontinuity  due  solely  to  changing  surface  geometry  (constant 
surface  structure  across  the  discontinuity),  it  will  be  geometric  dimensions  such  as  orientation,  length,  and 
width that will vary with the changing surface geometry. It would therefore be desirable to have texture
tokens that have attributes that change when the surface geometry changes, if discontinuities due solely to
changing surface geometry are to be detected.

Discontinuities  in  surface  structure  can  be  detected  in  two  ways.  One  way  utilizes  geometric 
attributes.  When  the  surface  structure  changes,  everything  is  likely  to  change  including  the  geometric 
attributes  given  above.  For  example,  the  change  in  size  of  the  items  in  Figure  3.6  could  be  used  to 
identify  the  boundary  between  the  two  regions.  A  second  way  to  detect  discontinuities  in  surface 
structure would use changes in structural attributes. For example, the number of corners per item in
Figure 3.6 could be used to identify the texture boundary between the two regions, since in the left-hand
region there are four corners per item (square), while in the right-hand region there are zero per item
(dot). This second method would be useful when all geometric attributes happen to match across the
texture boundary, causing the first method to fail. Whether this is likely to occur in natural images is
uncertain, however, a point that we shall return to in Section 6.
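
The two ways of detecting a structural discontinuity can be sketched on a single row of tokens. The
token records, the attribute names size and corners, and the simple difference-of-means criterion in
best_split are all assumptions chosen for illustration; as noted above, the actual form of the texture
tokens and of the detection method is unresolved.

```python
from statistics import mean

def best_split(values):
    """Return (index, contrast) of the split maximizing |mean(left) - mean(right)|."""
    best_index, best_contrast = None, 0.0
    for i in range(1, len(values)):
        contrast = abs(mean(values[:i]) - mean(values[i:]))
        if contrast > best_contrast:
            best_index, best_contrast = i, contrast
    return best_index, best_contrast

# Hypothetical row of tokens crossing a boundary like that of Figure 3.6:
# squares (4 corners) on the left, dots (0 corners) on the right, with the
# item size deliberately matched so that the geometric attribute is silent.
tokens = [{"size": 3.0, "corners": 4}] * 5 + [{"size": 3.0, "corners": 0}] * 5

size_split = best_split([t["size"] for t in tokens])
corner_split = best_split([t["corners"] for t in tokens])
print(size_split, corner_split)
```

In this contrived case the geometric attribute gives no split at all, while the structural attribute
places the boundary between the fifth and sixth tokens.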

We have not yet discussed how to distinguish discontinuities due solely to changing
surface geometry from those that contain structural changes, but only how to detect either kind when
present. For instance, we saw above that the changing size of the image of surface items could be used in
some cases to detect either kind of discontinuity, but it would not distinguish between them. Let us turn
to this issue next.

Distinguishing  geometric  and  non-geometric  texture  change  contours 

How can texture change contours that possibly are due solely to a change in surface geometry
be distinguished from those that must involve some non-geometric, structural change? When the surface
geometry changes but surface structure does not at a texture change contour, many image properties
usually remain invariant: the number of different scales at which surface items occur on a surface, the
approximate contrast, color, and packing factor (how tightly packed) of the items at each scale, and
whether or not they are oriented. When surface structure changes at a texture change contour, everything
is likely to change including the above geometric invariants. A procedure that utilizes such geometric
invariants would thus seldom err in distinguishing geometric from non-geometric contours.

4. The Texture Tokens
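
A procedure of this kind might look roughly as follows. This is only a sketch under strong simplifying
assumptions: each side of the contour is summarized by a single hypothetical record (one contrast and
one packing value rather than one per scale), the field names are invented, and the 20% tolerance is
arbitrary.

```python
def could_be_purely_geometric(left, right, tolerance=0.2):
    """True if the listed invariants roughly agree on both sides of a contour."""
    if left["num_scales"] != right["num_scales"]:
        return False
    if left["oriented"] != right["oriented"]:
        return False
    if left["color"] != right["color"]:
        return False
    for key in ("contrast", "packing"):
        a, b = left[key], right[key]
        # Compare approximately, relative to the larger of the two values.
        if abs(a - b) > tolerance * max(abs(a), abs(b)):
            return False
    return True

# Hypothetical region summaries. Orientation and density are free to differ,
# since they vary with surface geometry; only the invariants are compared.
crease_left = {"num_scales": 2, "oriented": True, "color": "grey",
               "contrast": 0.80, "packing": 0.30}
crease_right = {"num_scales": 2, "oriented": True, "color": "grey",
                "contrast": 0.75, "packing": 0.28}
other_material = {"num_scales": 1, "oriented": False, "color": "grey",
                  "contrast": 0.30, "packing": 0.60}

print(could_be_purely_geometric(crease_left, crease_right))
print(could_be_purely_geometric(crease_left, other_material))
```

A result of True means only that the contour could be due solely to a change in surface geometry; as
argued earlier, a structural change can mimic a geometric one, so the test cannot confirm it.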

Using image texture to infer surface information involves two broad stages. In the first stage,
the basic elements that are to represent the local structure of the texture, which we shall call the texture
tokens, are made explicit. In the second stage, the spatial variation of these tokens can be used to infer
local  surface  orientation,  surface  depth,  and  the  location  of  surface  discontinuities,  and  their  temporal 
variation  may  be  used  to  infer  motion  correspondence.  It  is  not  presently  known  what  constitutes  the 
texture  tokens  of  the  first  stage;  this  could  conceivably  range  from  grey-level  values  to  intensity  changes 
to  primitives  that  represent  individual  texture  elements  and  their  attributes  such  as  small  blobs  of  a 
particular  orientation,  contrast,  and  size.  This  section  explores  the  nature  of  the  texture  tokens  and 
attempts  to  restrict  this  range. 

Separating  the  effects  of  different  surface  processes 

A  major  function  the  texture  tokens  must  serve  is  separating  the  effects  of  different  surface 
processes  in  an  image.  As  Section  2  stated,  surface  structure  is  often  due  to  different  physical  processes 
operating on a surface, each at its own scale. Items generated by a given process on that surface will often
be similar to one another in attributes such as size, shape, orientation, color, and contrast. The spatial
variation of the projection of these items in an image can provide information about the structure and
3-D geometry of the surface on which the items reside; for instance, a discontinuity in the orientation of
similar items in an image can signal a discontinuity in surface geometry or surface structure (see Section
3). To utilize this information, however, it is necessary to separate the effects of different processes, for
otherwise any useful information carried by items generated by a given physical process will be obscured
in an image by the effects of other processes also operating there. For example, if the common
orientation of bricks in a wall is to be appreciated, then it is preferable that neither markings on those
bricks nor large spots encompassing several bricks interfere with the description of the organization of the
bricks themselves.

The  role  of  scale  in  separating  the  effects  of  different  processes 

Since  different  physical  processes  often  operate  at  different  scales  on  a  surface,  the  particular 
scale  at  which  an  image  of  such  a  surface  is  examined  should  be  a  useful  factor  for  separating  the  effects 
of  the  different  processes  operating  there.  For  example,  if  Figure  4.1  is  examined  at  very  small  scales, 
then  neither  a  change  in  the  distribution  of  grey-level  values  nor  a  change  in  the  orientation  distribution 
of the intensity changes can identify the boundary between the two regions that are composed of w's of
differing orientation, since the amount of ink per unit area is the same on each side of this boundary, and
the orientation distribution of the component line segments is also the same on each side of the boundary
-- 50% are horizontal and 50% are vertical. The orientation information needed to identify the boundary
between  the  two  regions  is  carried  at  a  larger  scale  in  the  orientation  of  each  w  as  a  whole,  and  not  at  a 
smaller  scale  in  the  orientation  distribution  of  its  component  line  segments. 

The intensity changes at a particular scale can be made explicit using a method developed by
Marr & Hildreth [1980]. In their theory of edge detection, they propose that an intensity change in an
image I(x,y) at a particular scale can be found by (in effect) first smoothing the image with a Gaussian
filter G of the desired bandwidth, and then applying the Laplacian operator ∇² to the smoothed image.
The loci of zero-crossings in ∇²(G * I) = ∇²G * I define the location of intensity changes at that scale.
Figure 4.2 shows the zero-crossings in the convolution of Figure 4.1 with a ∇²G operator having an
excitatory region of width about the same as the width of the w's. Note that at this scale, the approximate
boundaries defined by the individual w's comprise the zero-crossings. Thus, the predominant local
orientation of the zero-crossings is the same as the local orientation of the w's, and the significant change
in their orientation at the boundary between the two regions in Figure 4.1 could be used to make that
boundary explicit. Thus, we see that if this image is examined at the appropriate scale, the effects of the
process that determines the orientation of each w can be separated from those smaller scale processes that
determine its component line segment structure, and in this case the intensity changes at that larger scale
are sufficient to separate the approximate boundaries of the w's from their internal structure.

Figure 4.1 The orientation distribution of the component line segments is the same in both the left and
right regions of this figure -- 50% of the line segments are horizontal and 50% are vertical. It is the
changing orientation of the individual w's and not their component line segments that defines the texture
boundary.

Figure 4.2 The zero-crossings of Figure 4.1 when convolved with a ∇²G operator having an excitatory
region with width about the same as the width of the w's. Since the zero-crossings at this scale make
explicit the rough boundary defined by each w, the local predominant orientation of the zero-crossings
will match the orientation defined by the individual w's, and will change significantly at the texture
boundary.
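
The Marr & Hildreth computation described above can be sketched in a few lines. The code below is a
minimal illustration, not their implementation: the kernel is sampled on a pixel grid and truncated at
four standard deviations, it is shifted to sum to zero (an assumption made here so that uniform regions
give no response), and a zero-crossing is marked wherever horizontally or vertically adjacent responses
have opposite sign. The kernel is written as ∇²G itself, so its centre is negative; the excitatory centre
of the text corresponds to the opposite sign, which does not move the zero-crossings. The central region
of ∇²G has diameter 2√2σ, which is how σ would be chosen to match an item width. The step-edge test
image is hypothetical.

```python
import math

def log_kernel(sigma):
    """Sampled Laplacian-of-Gaussian kernel, shifted so its entries sum to zero."""
    half = int(math.ceil(4 * sigma))
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            g = math.exp(-r2 / (2 * sigma ** 2))
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4 * g)
        kernel.append(row)
    offset = sum(map(sum, kernel)) / len(kernel) ** 2
    return [[v - offset for v in row] for row in kernel]

def convolve(image, kernel):
    """'Valid' 2-D convolution (the kernel is symmetric, so no flip is needed)."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(kernel[j][i] * image[r + j][c + i]
                 for j in range(k) for i in range(k))
             for c in range(cols)]
            for r in range(rows)]

def zero_crossings(response):
    """Mark pixels whose right or lower neighbour has the opposite sign."""
    rows, cols = len(response), len(response[0])
    marks = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and response[r][c] * response[rr][cc] < 0:
                    marks[r][c] = True
    return marks

# Hypothetical test image: a vertical step edge between columns 11 and 12.
image = [[0.0] * 12 + [1.0] * 12 for _ in range(24)]
response = convolve(image, log_kernel(sigma=1.5))
edges = zero_crossings(response)
```

For the step image, the only sign change in the response lies at the step itself. Repeating the
computation with several values of sigma gives the zero-crossings at several scales.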

The ∇²G operator can also be used in certain cases to find intensity changes that are coincident
with  the  texture  boundary  itself.  Figure  6.8,  consisting  of  convolutions  of  a  90°  change  in  orientation  of 
small  line  segments  shows,  however,  that  there  need  not  be  any  significant  intensity  changes  present 
there.  In  fact,  we  should  not  expect  any  to  be  there  unless  the  average  intensity  changes  between  the 
textured  regions  on  each  side  of  the  texture  boundary. 

The  raw  intensity  changes  are  not  always  sufficient  for  separating  the  effects  of  different  processes 

In view of Figure 4.2, it would be tempting to think that the ∇²G zero-crossings at various
scales may be sufficient as the set of texture tokens. There are, however, physical reasons that we should
not expect this to be so. The intensity changes at a given scale will not solely correspond to structural
items at a particular scale, but will be affected to some degree by items at all scales and their effect will
vary with the contrast of these items. In the brick wall example, high contrast markings on the bricks
could noticeably influence the zero-crossing description at the scale of the bricks themselves -- something
that was earlier considered undesirable for the description produced by the texture tokens. To show that
this effect indeed occurs, a technique devised by Stevens [1981b] was used to create Figure 4.3. This
figure is composed of many small 2x2 black and white checkerboards. Stevens reasoned that if such small
checkerboards appeared on a background of grey that is the psychophysical average of the black and
white, then the output of any smooth convolution operator that encompasses several of these
checkerboards will not differ significantly from that operator's output when encompassing just the grey
background. [...] oriented element and the 90° change in orientation of the triples defines a texture
boundary, there will be no scale at which the distribution of the intensity changes can be used to identify
this texture boundary, when this figure is provided with the matching grey background. Figure 4.4 gives
the ∇²G zero-crossings for Figure 4.3 near the texture boundary. At the smallest scales of the ∇²G
operator, the edges of the component squares of checkerboards are tracked by the zero-crossings. At the
largest scales, as expected, the zero-crossings are of low amplitude (amplitude is not depicted in these
figures) and seem to meander randomly. At intermediate scales, parts of the rough boundary defined by
each collinear triple appear in the zero-crossings, but many zero-crossings corresponding to each triple's
internal structure also appear. But at no scale is the boundary of the triples made explicit and their
internal structure filtered out as was possible for the w's above, making extraction of the triples'
orientation and the texture boundary nontrivial. In Section 7, we shall see that the human observer can
rapidly detect a boundary created by an orientation change of such checkerboard triples.

Figure 4.3 The texture elements in this figure consist of collinear triples of 2x2 checkerboards, which
are oriented horizontally in the left region and vertically in the right region. When this figure is provided
with the matching grey background, there is no scale at which a significant change occurs in the orientation
distribution of the ∇²G zero-crossings at the texture boundary between the two regions, as there was for
Figure 4.1.
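
Stevens' reasoning about smooth operators can be checked numerically on a toy image. The sketch below
is an illustration only: the image size, the checkerboard positions, the use of 0.5 as the matching grey
(the arithmetic rather than the psychophysical average), and the plain Gaussian-weighted average
standing in for "any smooth convolution operator" are all assumptions made here.

```python
import math

GREY = 0.5  # arithmetic mean of black (0.0) and white (1.0)

def checkerboard_image(size, origins, square=2):
    """Grey image with a 2x2 checkerboard (squares of `square` pixels) at each origin."""
    image = [[GREY] * size for _ in range(size)]
    for top, left in origins:
        for r in range(2 * square):
            for c in range(2 * square):
                white = ((r // square) + (c // square)) % 2 == 0
                image[top + r][left + c] = 1.0 if white else 0.0
    return image

def gaussian_average(image, row, col, sigma):
    """Gaussian-weighted mean intensity centred on (row, col)."""
    total = weight_sum = 0.0
    for r, line in enumerate(image):
        for c, value in enumerate(line):
            w = math.exp(-((r - row) ** 2 + (c - col) ** 2) / (2 * sigma ** 2))
            total += w * value
            weight_sum += w
    return total / weight_sum

# A horizontal collinear triple of checkerboards near the image centre.
triple = [(18, 12), (18, 18), (18, 24)]
image = checkerboard_image(40, triple)
blank = checkerboard_image(40, [])

# Large operator spanning the whole triple versus the same operator on plain grey.
large_dev = abs(gaussian_average(image, 20, 20, 6.0) - gaussian_average(blank, 20, 20, 6.0))
# Small operator centred on a single white square of the middle checkerboard.
small_dev = abs(gaussian_average(image, 18, 18, 0.7) - GREY)
print(large_dev, small_dev)
```

The large operator barely registers the triple, whereas the small one responds strongly to an
individual square, consistent with the behaviour of the zero-crossings at the largest and smallest scales.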

To reinforce this idea that the raw intensity changes cannot always separate the effects of
different processes, a second example will be given. The previous example showed that the substructure
of an item can influence the intensity changes at large enough scales to leave that item only implicit in the
intensity changes. The second example again uses items at two different scales, but this time, the smaller
items are not components of the larger items, but instead are independent of them. Figure 4.5 consists of
line segments of two different lengths. The shorter line segments are oriented at 45° on the left-hand side
of the figure and at -45° on the right-hand side, while the longer line segments are randomly oriented
across the figure. Without the longer line segments, there would be a sharp orientation change in the
zero-crossings at the scales that capture the smaller line segments. The randomly oriented, longer line
segments, by adding noise to the local orientation distributions, weaken this sharp change in the
zero-crossings. Thus, we again have an example where items from one process interfere with the intensity
changes that best capture those items from a different process. The information necessary to separate
these two kinds of items is clearly present in this image, however: it is contained in the differing lengths of
the individual line segments themselves.

Figure 4.5 The shorter line segments in this figure are oriented at 45° on the left-hand side and at -45° on
the right-hand side, while the longer line segments are randomly oriented across the figure. Without the
longer line segments, there would be a sharp orientation change in the zero-crossings at the scales that
capture the smaller line segments. The longer line segments weaken this change in the zero-crossings.
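
The benefit of separating items by length can be sketched with tokens that already carry a length and an
orientation. Everything specific here is an assumption for illustration: the (x, length, orientation)
token format, the evenly spread orientations standing in for random ones, the length threshold, and the
use of doubled-angle unit vectors (a standard device for averaging orientations, which repeat every 180°)
as the measure of orientation change.

```python
import cmath
import math

def mean_axis(segments):
    """Mean doubled-angle unit vector of a list of (x, length, orientation_deg)."""
    vectors = [cmath.exp(2j * math.radians(theta)) for _, _, theta in segments]
    return sum(vectors) / len(vectors)

def boundary_signal(segments, boundary_x, max_length=None):
    """Orientation change across boundary_x, optionally using only short segments."""
    if max_length is not None:
        segments = [s for s in segments if s[1] <= max_length]
    left = [s for s in segments if s[0] < boundary_x]
    right = [s for s in segments if s[0] >= boundary_x]
    return abs(mean_axis(left) - mean_axis(right))

# Hypothetical tokens in the spirit of Figure 4.5: short segments at +45 deg on
# the left and -45 deg on the right, plus long segments with spread orientations.
spread = [0, 30, 60, 90, 120, 150]
short = [(x, 2.0, 45.0 if x < 6 else -45.0) for x in range(12)]
long_ = [(x, 8.0, float(spread[x % 6])) for x in range(12)]

mixed = boundary_signal(short + long_, boundary_x=6)
separated = boundary_signal(short + long_, boundary_x=6, max_length=4.0)
print(mixed, separated)
```

With all segments pooled, the orientation change across the boundary is diluted; restricting attention
to the short segments restores it in full, which is the separation that the raw intensity changes fail
to provide.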

What are the texture tokens?

We have seen above that the raw intensity changes appear to be too primitive a description of
image texture to suffice as the sole texture tokens. In the above two examples, it is groupings, not
individual points, of the intensity changes that correspond to the items that produce the texture boundary
-- the oriented triples in the first example and the short line segments in the second example. This
suggests that some form of local grouping of the intensity changes that results in tokens that roughly
correspond to individual line segments, small blobs, local clusters and collinear groupings of these could
provide a description of the local structure of image texture that better separates the items produced by
different physical processes. Marr [1976] has proposed that much local image structure can be made
explicit by assigning place tokens to such items as terminations, small blobs and line segments, which are
presumably found from the intensity changes, and then by grouping these tokens to find collinear
groupings and local clusters, which are then also assigned place tokens. These tokens would correspond
to small markings, scratches, surface elements and local groupings of these on physical surfaces. It is not
presently clear whether the early representation of texture requires tokens that faithfully and precisely
represent these kinds of items everywhere in an image. Perhaps some computationally less expensive
processing that roughly identifies a sizable fraction of such items would suffice at this stage, with a more
precise description available with scrutiny if needed.

Exactly what the texture tokens are thus remains an open question. Its solution is important not
only for understanding how to detect texture boundaries, which has been emphasized here, but also for
depth from texture and motion correspondence. The texture tokens could provide the unforeshortened
measure needed to obtain depth from texture, as discussed in the introduction. Further, the texture
tokens, like the texture edge, would represent larger scale and rarer primitives for motion correspondence
that have fewer candidate matches over a given range than the intensity changes. But being more precise
about these processes must await the determination of the texture tokens, and not much can be said
definitely about the form of the texture tokens at this point other than it appears that the intensity
changes alone will not suffice.

5. Summary of the Theory

Three physical constraints on surface structure...

(1) The visible world can be regarded as being composed of smooth surfaces having reflectance
functions whose spatial variation may be complex.

(2) Physically different processes operate on a surface to form different kinds of items there.

(3) Surface items generated by the same processes tend to be more similar to one another in their size,
shape, lightness, color, and spatial arrangement than to surface items generated by other processes.

...combined with the goal of producing the 2½-D sketch, a viewer-centered representation of the visible
surfaces where the factors that produce an image -- surface geometry, surface reflectance, illumination,
and viewpoint -- are separated, lead to the following conclusions for the representation of the image
texture:

(1) A texture edge primitive is needed to identify texture change contours, which are formed by an
abrupt change in the 2-D organization of similar items in an image. The texture edge can be used
for locating discontinuities in surface structure and surface geometry and for establishing motion
correspondence.

(2) Abrupt changes in attributes that vary with changing surface geometry -- orientation, density,
length, and width -- should be used to identify discontinuities in surface geometry and surface
structure.

(3) Texture tokens are needed to separate the effects of different physical processes operating on a
surface. They represent the local structure of the image texture. Their spatial variation can be used
in the detection of texture discontinuities and texture gradients, and their temporal variation may
be used for establishing motion correspondence. What precisely constitutes the texture tokens is
unknown; it appears, however, that the intensity changes alone will not suffice, but local groupings
of them may.

(4) The above primitives need to be assigned rapidly over a large range in an image.

6. Texture Edge Demonstrations

The  primary  purpose  of  this  section  is  to  present  psychophysical  evidence  that  texture  edges  are 
detected  by  the  human  visual  system  and  that  they  are  represented  over  a  large  range  in  an  image. 
The  secondary  purpose  is  to  characterize  those  types  of  texture  changes  that  can  give  rise  to 
perceived  texture  edges. 

Texture  discrimination  and  texture  edges 

Most previous psychophysical studies of visual texture have concentrated on their
discrimination (e.g. Julesz [1973, 1981] and Beck [1966]). For example, in Figure 6.1 we can
immediately see without scrutiny that the lower left region of the textured pattern is different from
the rest of the pattern; we can discriminate the regions. In Figure 6.2, the textured pattern looks
homogeneous without scrutiny even though the upper right corner is composed of backward R's,
while the remainder of the pattern is composed of forward R's [Julesz, 1973]. In this case, we cannot
discriminate the regions. Several theories have been advanced to explain why some textures are
discriminable while others are not, with Julesz's second-order statistic conjecture probably the best
known [Julesz, 1973].

The  problem  with  applying  texture  discrimination  to  the  texture  edge  problem  is  that  texture 
discrimination  is  an  "anything  goes"  task;  the  viewer  may  use  any  means  at  his  disposal  to  try  to 
discriminate  the  textures  within  the  allotted  time.  Suppose  a  viewer  is  asked  which  one  of  four 
quadrants  of  a  texture  pattern  is  different  from  the  others  (as  in  Figure  6.1)  and  suppose  that  he 
correctly  identifies  that  quadrant.  Did  he  find  the  correct  quadrant  by  first  finding  the  texture 
boundary  between  the  different  regions,  or  did  he  instead  sample  four  elements,  one  from  each 
quadrant, and compare them? Because it is conceivable that texture discrimination can occur at
least in some cases without the texture boundaries being explicitly represented, such texture
discrimination studies cannot be used as evidence that texture edges are detected by the human
visual system. For our purposes, these studies can only show that there are some texture differences
(e.g. Figure 6.2) for which texture edges are not detected, since if they were detected, we could
presumably discriminate them. But given the "anything goes" nature of the discrimination task, it
cannot be assumed that all discriminable textures have their boundaries explicitly represented.
This means that different paradigms must be used to study texture edges.

Figure 6.1 A discriminable texture. The lower left region can be seen immediately to have a different
texture from the rest of the figure.

Figure 6.2 An indiscriminable texture. The figure initially appears homogeneous. Close inspection
reveals that the upper right region is composed of backward R's, while the remainder of the figure is
composed of forward R's [Julesz, 1973].

The  apparent  motion  paradigm 

It was suggested earlier that texture boundaries could be used to establish motion
correspondence. We can test this hypothesis and test the human ability to perceive texture edges by
using an apparent motion paradigm. It is well known that if a display sequence such as Figure 6.3a
followed by Figure 6.3b is presented to a viewer with a short (say 30 msec) interstimulus interval
(ISI), the viewer will perceive apparent motion -- in this case a single square will be seen to move to
the right and rotate 45°. Interestingly, if the straight line sides of the square are replaced by texture
edges, the correspondence can still be achieved. When the sequence in Figure 6.4 is presented, the
whole pattern is seen to move to the right with the embedded square appearing to both move to the
right and rotate 45°. Here the texture boundary is formed by a 60° orientation difference in the
small line segments. Typically, an embedded square of about 5° visual angle was used, with each
frame presented for 300 msec and an ISI of 30 msec, but the correspondence can be achieved over
a wide range of visual angle and does not depend critically on the ISI. It will be shown below that
there is no intensity edge at any scale present at the boundary between the two textured regions,
so the correspondence must be established from the texture difference.


Figure 6.3 An apparent motion sequence. Display (a) is presented for 300 msec, a blank display follows
for 30 msec, and then Display (b) is presented for 300 msec. The viewer perceives a single square moving
to the right and rotating.

Figure 6.4 Apparent motion that uses texture edges. As in Figure 6.3, Display (a) is presented for 300
msec, a blank display follows for 30 msec, and then Display (b) is presented for 300 msec. The viewer
perceives the whole pattern moving to the right with the embedded square appearing both to move to the
right and rotate. The apparent motion paradigm can be used to test for those texture changes that
produce clearly perceived texture boundaries.

Ramachandran et al. [1973] have reported establishing apparent motion using a texture
boundary with a second-order statistical difference (with equal first-order statistics). In their
paradigm, an embedded square is translated but not rotated. This latter format has the
disadvantage for our uses that the direction in which the embedded square of different texture is
moved can be perceived even when its boundary is only weakly, if at all, perceived. By adding the
rotational component to the embedded square's motion, only a clearly perceived boundary gives
rise to a square that appears to both translate and rotate. The key point here is that, unlike the
texture discrimination tasks, it is difficult to imagine how a viewer can successfully complete this
motion task without his visual system making explicit the boundary between the two regions of
differing texture.
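The two-frame display described in this subsection can be sketched in code. The following is only an illustration, not the procedure used to produce the original figures: the function names, the field size, the segment count and length, and the choice of 30° and 90° for the two orientations (a 60° difference) are all assumptions made for the example. Each frame scatters short line segments over a field and gives each segment one of two orientations according to whether its centre falls inside the embedded square. As a simplification, the surround is regenerated for the second frame rather than translated.

```python
import math
import random

def in_square(px, py, cx, cy, side, rot_deg):
    """True if point (px, py) lies in a square centred at (cx, cy),
    with the given side length, rotated by rot_deg about its centre."""
    a = math.radians(-rot_deg)          # undo the square's rotation
    dx, dy = px - cx, py - cy
    u = dx * math.cos(a) - dy * math.sin(a)
    v = dx * math.sin(a) + dy * math.cos(a)
    return abs(u) <= side / 2 and abs(v) <= side / 2

def make_frame(square, inside_deg, outside_deg, n_segments=400,
               field=100.0, seg_len=3.0, seed=0):
    """Return segments (x0, y0, x1, y1); the orientation of a segment depends
    only on whether its centre is inside the embedded square."""
    rng = random.Random(seed)
    cx, cy, side, rot_deg = square
    segments = []
    for _ in range(n_segments):
        px, py = rng.uniform(0, field), rng.uniform(0, field)
        inside = in_square(px, py, cx, cy, side, rot_deg)
        theta = math.radians(inside_deg if inside else outside_deg)
        hx = 0.5 * seg_len * math.cos(theta)
        hy = 0.5 * seg_len * math.sin(theta)
        segments.append((px - hx, py - hy, px + hx, py + hy))
    return segments

def orientation_deg(segment):
    """Orientation of a segment in degrees, folded into [0, 180)."""
    x0, y0, x1, y1 = segment
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

# Hypothetical geometry: the square shifts to the right and rotates 45 degrees.
frame_a = make_frame((40.0, 50.0, 30.0, 0.0), inside_deg=90, outside_deg=30, seed=1)
frame_b = make_frame((60.0, 50.0, 30.0, 45.0), inside_deg=90, outside_deg=30, seed=2)

# (display, duration in msec); None stands for the blank interstimulus interval.
sequence = [(frame_a, 300), (None, 30), (frame_b, 300)]
```

Because both regions use segments of the same length and density, the only difference between them in this sketch is segment orientation, which is the property the paradigm is meant to isolate.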

The  static  shape  recognition  paradigm 

A second paradigm that involves static shape recognition can also provide evidence of human
ability to perceive texture edges. If an embedded figure in a texture pattern is sufficiently complex
in shape and can still be recognized without scrutiny, then it seems likely that that shape's boundary
is detected by the visual system. In Figure 6.5, which uses the same texture change as in the motion
example, there is little difficulty in recognizing which letter of the alphabet corresponds to the
embedded shape. Thus, this gives evidence from two independent techniques -- the apparent
motion paradigm and the static shape recognition paradigm -- that a particular kind of texture
boundary (one formed by a 90° difference in small line segments) is detected by the visual system.
Kidd, Frisby and Mayhew [1979] have found that texture boundaries can initiate vergence
movements for stereopsis. This could serve as a basis for a third paradigm for studying texture
boundaries, but this has not been investigated here.

Figure 6.5 The shape of the embedded region with line segments of differing orientation can be
recognized easily as the letter Z. This shape recognition paradigm provides a second test for those texture
changes that produce clearly perceived texture boundaries.

An orientation difference of line segments is not the only sort of texture boundary that is
successful in the apparent motion and shape recognition paradigms. Figure 6.6 shows a difference
in the dot density (4:1) that results in immediate shape recognition. In the apparent motion
paradigm, the same texture change results in the embedded square being perceived as moving to
the right and rotating. There are many sorts of texture changes that fail in both the shape
recognition and motion paradigms. Figure 6.7 shows several types of texture changes for which
static shape recognition is difficult without scrutiny. These same texture changes do not result in
the correspondence of the embedded square in the apparent motion paradigm; no embedded
square is seen moving to the right and rotating. In particular, Figure 6.7c, which fails the tests for
perceived texture edges, passes the Julesz-style test for texture discrimination (Figure 6.1). While
some texture boundaries result in motion correspondence and shape recognition and others do not,
in all the texture boundaries that have been tried, motion correspondence is established if and only
if shape recognition is immediate. This strengthens the hypothesis that texture edges are explicitly
represented by the visual system.

Texture  edges  are  not  always  explicitly  present  in  the  zero-crossings 

It was claimed that in Figure 6.5 there is no average intensity change at the texture boundary at
any scale, and thus this boundary is not explicit in the intensity changes. This claim can be
substantiated by convolving the figure with several sizes of the ∇²G mask of Marr and Hildreth
[1980], and examining the zero-crossings in the output. As described earlier in Section 4, the
zero-crossings of a ∇²G operator, which is the composition of a Gaussian and the Laplacian,
identify the locations of the intensity changes at the scale determined by the bandwidth of the
Gaussian. Figure 6.8 shows the zero-crossings in the convolutions of a portion of the texture
boundary in Figure 6.5 with ∇²G masks of various sizes. Note that at the smallest scale the individual
line segments are captured, and at the largest scale the external boundary is captured, but at no
scale is the boundary between the two regions present in the zero-crossings.
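For readers unfamiliar with the operator, the way a zero-crossing marks an intensity change can be shown in one dimension. The sketch below is an assumption-laden simplification and not the two-dimensional ∇²G convolution used for Figure 6.8: it filters a luminance profile with a sampled second derivative of a Gaussian (the 1-D analogue of ∇²G) and reports sign changes in the output. The function names, the truncation at 4σ, and the small threshold `eps` are choices made for this example.

```python
import math

def d2_gaussian_kernel(sigma):
    """Sampled second derivative of a Gaussian, truncated at 4 sigma."""
    radius = int(math.ceil(4 * sigma))
    kernel = [((x * x) / sigma ** 4 - 1.0 / sigma ** 2)
              * math.exp(-(x * x) / (2.0 * sigma ** 2))
              for x in range(-radius, radius + 1)]
    mean = sum(kernel) / len(kernel)     # remove the small DC left by truncation
    return [k - mean for k in kernel], radius

def filter_valid(signal, kernel):
    """Correlate signal with a symmetric kernel, keeping fully overlapped samples."""
    m = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(m))
            for i in range(len(signal) - m + 1)]

def zero_crossings(signal, sigma, eps=1e-9):
    """Positions (in signal coordinates) where the filtered signal changes sign."""
    kernel, radius = d2_gaussian_kernel(sigma)
    response = filter_valid(signal, kernel)
    crossings = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        # Ignore sign flips between values that are numerically zero.
        if (a > eps and b < -eps) or (a < -eps and b > eps):
            crossings.append(radius + i + 0.5)
    return crossings

# A luminance step between samples 99 and 100.
step = [0.0] * 100 + [1.0] * 100
for sigma in (2.0, 4.0, 8.0):
    # At each scale the single crossing falls at 99.5, the location of the step.
    print(sigma, zero_crossings(step, sigma))
```

A profile with no luminance change (for example a constant list) yields no crossings at any scale, which is the 1-D counterpart of the situation reported for the texture boundary in Figure 6.5.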


Figure 6.6 A 4:1 dot density difference can also give rise to shape recognition. The viewer can recognize
immediately the shape of the embedded region of greater density as the letter I. In the density case,
however, it is difficult to separate experimentally the relative influences of large scale intensity changes
and changes in token density at the perceived boundary.

Figure 6.7 Several texture changes for which immediate shape recognition is difficult. Close examination
of each pattern reveals that the embedded shape is (a) the letter H, (b) the letter V, and (c) the letter Z.
Note that the texture change in (c) is the same as in the "discriminable" Figure 6.1.

Figure 6.8 The zero-crossings for the texture change in Figure 6.5 using ∇²G operators of various sizes.
The leftmost figure of each row depicts the image used to produce the zero-crossings in that row. The
number adjacent to each figure gives the diameter of the excitatory region of the operator used to
produce the zero-crossings in that figure, where each line segment is 9 units long. At no scale is the
boundary between the two regions of different line segment orientation explicitly demarked by a
zero-crossing contour.


In Figure 6.6, there is a large scale intensity change that could be used to identify the embedded
region's boundary (this is easily seen to be true by viewing the figure from far enough away that the
individual dots are not resolvable -- the embedded shape can still be perceived due to the large scale
intensity change). The fact that a texture boundary that is due to changing texture element density,
length or width is often accompanied by a large scale intensity change that coincides with the texture
boundary makes it difficult to assess experimentally whether these texture changes result in perceived
texture edges in the absence of these large scale intensity changes; further work is needed in this
area. Orientation changes have been emphasized in this paper, since they are free of this
complication.
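The confound described here can be made concrete with a little arithmetic. Assuming non-overlapping texture elements of fixed area and luminance on a uniform background (an idealization introduced for this example, with hypothetical numbers), the mean luminance of a region depends on element density but not on element orientation.

```python
def mean_luminance(density, element_area, element_lum=0.0, background_lum=1.0):
    """Mean luminance of a region covered by non-overlapping elements.

    density is in elements per unit area, so density * element_area is the
    fraction of the region that the elements cover.
    """
    coverage = density * element_area
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must lie in [0, 1]")
    return background_lum + coverage * (element_lum - background_lum)

# Hypothetical numbers: black unit-area dots on white, with a 4:1 density ratio.
dense_region = mean_luminance(density=0.04, element_area=1.0)
sparse_region = mean_luminance(density=0.01, element_area=1.0)

# An orientation change leaves both density and element area untouched,
# so the two sides of the boundary have the same mean luminance.
rotated_inside = mean_luminance(density=0.02, element_area=1.0)
rotated_outside = mean_luminance(density=0.02, element_area=1.0)
```

Under these assumptions the denser region is darker on average, which is the large scale intensity change that accompanies a density boundary; the orientation boundary produces no such difference.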

Image  range  of  the  texture  edge  primitive 

Motion correspondence and shape recognition can be achieved with these figures as large as
30-40° in visual angle; at this size, local scrutiny could reveal only a small portion of the boundary
at a given time. But the motion correspondence is immediate, and shape recognition can still occur
when a figure is briefly flashed (300 msec). This supports the hypothesis that many texture edges
are being simultaneously found over a large portion of the image.

Characterizing  those  texture  changes  that  produce  perceived  texture  edges 

A complete characterization of those texture changes that produce perceived texture edges and
those that do not (as evidenced by the above apparent motion and shape recognition paradigms) has
yet to emerge. A complete phenomenological characterization is difficult to obtain because there
may be many attributes (e.g. contrast, color, orientation, density, length) that the visual system can
use to detect texture edges, and new attributes can always be proposed that have yet to be tested
psychophysically. Further, it is difficult to separate some attributes experimentally, such as texture
element density from average local intensity, as discussed above. Nevertheless, two rules seem to
characterize many of those texture changes that can and cannot produce perceived texture edges.

The first rule is that significant, abrupt changes in attributes that vary with changing surface
geometry produce perceived texture edges. This has already been shown to be the case above with
the orientation of texture elements. Intensity, density, and size changes of texture elements can also
produce perceived texture boundaries, but further work is needed to decouple the large scale
intensity changes from the density and size changes to assess each attribute's individual effect.
Conversely, the textures in Figure 6.7 were generated by holding constant average local texture
element density, orientation, length and width, but otherwise using different shaped texture
elements across the texture boundary. Even though there are significant structural differences in
the texture elements across the boundary, such as the number of terminations and corners, these
changes alone do not produce perceived texture edges. In fact, texture element color and contrast
are the only attributes that do not (usually) vary appreciably with changing surface geometry that
have been found so far to produce perceived texture edges. This contrasts with Julesz's results for
texture discrimination, which indicate that changes in the number of terminations can apparently be
used to discriminate textured regions [Julesz, 1981]. As mentioned earlier, texture discriminability
does not ensure that a clear texture boundary will be perceived.

This first rule is not surprising in light of the discussion in Section 3 on the uses of texture edges.
Reiterating what was said there, texture edges can identify discontinuities in surface geometry and
surface structure. At a texture discontinuity where surface geometry changes but surface structure
does not, it will be those image attributes that vary with surface geometry -- e.g. orientation,
density, length, width -- that can be used to identify the discontinuity in the image. At a
discontinuity where surface structure changes, everything is likely to change -- orientation, density,
color, contrast, size. Further, the presence or absence of geometric invariants such as similarly
oriented items at a given scale that remain oriented across a texture boundary can be used to
distinguish between these two kinds of discontinuities. Thus, while structural attributes such as
number of terminations and corners could help detect changes in surface structure when geometric
attributes such as orientation, density, and size all happen to be constant across a texture
discontinuity, the visual system could consider such an occurrence too unlikely in natural images to
justify its detection.

The second rule is that the comparison of distributions of a given attribute of otherwise similar
texture elements is kept simple. This rule is detailed here only for the orientation attribute. Figure
6.9 shows that the oriented line segments at two fixed orientations (45° and -45°) found inside the
embedded Z-shaped region are sufficient to match the randomly oriented line segments found
outside the embedded region -- the embedded letter is difficult to recognize quickly. Likewise, the
same texture change does not produce motion correspondence in the apparent motion paradigm.
This suggests that the visual system may assume that the orientation distribution of items at a given
scale either clusters around a single value or is, for all intents and purposes, random. A process that
naturally produces, say, a distinct, two-peaked orientation distribution (at 45° and -45°) of
otherwise identical items would be deemed too rare to be worth distinguishing from a random
distribution. Incidentally, this contrasts with previous work by the author using texture
discrimination instead, for which three orientations were found necessary to match random

Figure 6.9 Two fixed orientations (45° and -45°) of the line segments inside the embedded region match
the random orientations of the line segments outside the embedded region; the embedded shape is
difficult to recognize initially as the letter H.
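One way to picture a "simple" comparison of orientation distributions is through a single summary statistic. The sketch below uses the mean resultant length of doubled angles, a standard measure of how strongly orientations (which repeat every 180°) cluster around one value. It is offered purely as an illustration under that assumption; the text does not say which statistic, if any, the visual system computes, and the function name and sample sizes are invented for the example.

```python
import math
import random

def axial_resultant(orientations_deg):
    """Mean resultant length of orientations with period 180 degrees.

    Angles are doubled so that theta and theta + 180 count as the same
    orientation. The result is near 1 for tight clustering around a single
    orientation and near 0 when there is no single preferred orientation.
    """
    n = len(orientations_deg)
    c = sum(math.cos(math.radians(2.0 * a)) for a in orientations_deg) / n
    s = sum(math.sin(math.radians(2.0 * a)) for a in orientations_deg) / n
    return math.hypot(c, s)

rng = random.Random(0)
single_peak = [45.0] * 200                           # one orientation
two_peaks = [45.0, -45.0] * 100                      # the case of Figure 6.9
random_sample = [rng.uniform(0.0, 180.0) for _ in range(2000)]

for name, sample in [("single", single_peak), ("two-peaked", two_peaks),
                     ("random", random_sample)]:
    print(name, round(axial_resultant(sample), 3))
```

On this measure the single-orientation sample scores 1, while the two-peaked sample at 45° and -45° scores essentially 0, as does the random sample up to sampling noise. A mechanism limited to such a statistic would therefore treat the two-peaked and random textures alike, which is consistent with the observation reported for Figure 6.9.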


7. Texture Token Demonstrations


In this section, psychophysical demonstrations are presented that the elementary tokens that the
human visual system uses to represent the local structure in image texture do not consist solely of
the raw intensity changes at a variety of scales in an image. Specifically, demonstrations will be
given that there are no significant changes in the orientation distribution of the ∇²G zero-crossings
at any scale that can be used to detect some texture boundaries that humans can readily perceive.
Two different approaches are taken to create these demonstrations.

The  Checkerboard  Paradigm 

The first approach utilizes the checkerboard technique described in Section 4. The general idea
is to use small black and white checkerboards as component items in larger scale groupings so that
the larger scale groupings will not be explicit in the larger scale intensity changes due to the
integrating effects of the convolution operator. In particular, each dot in Figure 7.1 can be
replaced by a small 2x2 black and white checkerboard and the entire figure given the matching grey
background that is the psychophysical average of the black and white (see Figure 4.3). This match is
achieved by viewing the checkerboards from sufficiently far away and adjusting the background
grey until the checkerboards disappear. Under these conditions, the embedded letter, which can
easily be perceived in the unmodified Figure 7.1, can still be immediately recognized in the so
modified figure, provided the figure is viewed from sufficiently close in (otherwise, if the viewer
moves back from the figure, the checkerboards eventually begin to disappear, with those toward the
periphery being affected first).

Figure 7.1 When the dots in this figure are replaced by small 2x2 black and white checkerboards and the
entire figure is given the matching grey background that is the psychophysical average of the black and
white, the embedded shape can still be recognized as the letter T. Figure 4.4 showed that at no scale is the
boundary between the two regions of different checkerboard triple orientation explicitly demarked by a
zero-crossing contour, nor is there a significant change in the local orientation distribution of the
zero-crossings at the texture boundary.

The previous section argues that the above is evidence that a texture change consisting of a large
change in the orientation of collinear triples of tiny 2x2 checkerboards can be identified by the
human visual system. Since any smooth spatial operator that encompasses several of these
checkerboards will respond with the same output that is given to the grey background, there is no
intensity change at any scale at the boundary between the two textured regions. Of crucial
importance here is the fact that the orientation defined by checkerboard triples is not explicit in the
intensity changes either. As shown in Section 4, the ∇²G zero-crossings at no scale make explicit the
boundaries of individual triples while filtering out their internal structure, and thus the changing
orientation of the triples at the texture boundary cannot be found by looking for a significant
change there in the local orientation distributions of zero-crossings of ∇²G operators at some scale
(see Figure 4.4).
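The averaging argument can be checked numerically on a toy image. The sketch below is a simplified stand-in for the displays described above: it uses a physical mid-grey of 0.5 in place of the psychophysically matched grey, a single patch rather than triples, and a Gaussian-weighted mean as the "smooth spatial operator". The image size, patch size, operator position, and function names are all assumptions of the example. It compares the response to a 2x2 checkerboard with the response to a solid black patch of the same size.

```python
import math

def blank_image(size, grey=0.5):
    """Square image filled with the background grey."""
    return [[grey] * size for _ in range(size)]

def stamp(image, top, left, cell, checker=True):
    """Write a 2x2 checkerboard, or a solid black patch, 2*cell pixels on a side."""
    for dy in range(2 * cell):
        for dx in range(2 * cell):
            if checker:
                black = (dy // cell + dx // cell) % 2 == 0
            else:
                black = True
            image[top + dy][left + dx] = 0.0 if black else 1.0

def gaussian_mean(image, cy, cx, sigma):
    """Gaussian-weighted mean luminance centred at row cy, column cx."""
    num = den = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            w = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            num += w * value
            den += w
    return num / den

checker_img = blank_image(41)
stamp(checker_img, 18, 18, cell=2, checker=True)
dot_img = blank_image(41)
stamp(dot_img, 18, 18, cell=2, checker=False)

# Deviation from the grey background seen by a coarse operator that is
# placed slightly off the centre of the patch.
checker_dev = gaussian_mean(checker_img, 17.0, 22.0, sigma=6.0) - 0.5
dot_dev = gaussian_mean(dot_img, 17.0, 22.0, sigma=6.0) - 0.5
print(checker_dev, dot_dev)
```

In this toy case the checkerboard shifts the coarse-scale average by only a small fraction of the shift produced by the solid patch, so at that scale it is nearly indistinguishable from the background, whereas a dot leaves a clear large scale intensity change.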

Mixed lengths paradigm

The second approach taken to demonstrate that the raw intensity changes are not sufficient as
the sole texture tokens utilizes texture elements of two different lengths. The general idea is that if
one set of texture elements of a given length has, say, some oriented structure in a texture, then this
oriented structure will be easier to detect in the presence of other texture elements of a very
different length than in the presence of other texture elements of a similar length, provided the
texture elements are first separated on the basis of their length. Figure 7.2a shows a texture pattern
composed of line segments of two different lengths. The shorter line segments are oriented at 65°
inside the embedded H-shaped region and at 25° outside this region. The larger line segments are
nine times as long as the shorter line segments and are oriented at 45° throughout the texture
pattern. Figure 7.2c shows, for reference, just the shorter lines found in Figure 7.2a. Figure 7.2b
contains an identical copy of the shorter line segments found in Figure 7.2a, but the larger line
segments have been shrunk to 1/9th of their length (the same length as the other line segments) with a
corresponding nine fold increase in their density (i.e. number/area), thus keeping the total amount
of 45° contour over a given area constant.

Figure 7.2 Texture patterns (a) and (b) both begin with underlying pattern (c), which has
line segments at 65° inside the embedded region and at 25° outside this region. Masking 45° line
segments nine times as long as those in pattern (c) and with one ninth the density (number/area) are
added to complete pattern (a). Masking 45° line segments of the same length and with the same density
as pattern (c) are added to complete pattern (b). The embedded H in pattern (a) is easier to recognize
than that in pattern (b), an effect that is accentuated at oblique or distant viewpoints. This result is
difficult to explain if the raw intensity changes at various scales are the sole texture tokens.

Thus, measuring the amount of contour at a given
orientation per unit area taken from very local descriptions of the intensity changes found in an
image of these figures would not show significant differences between Figure 7.2a and Figure 7.2b.
Figure 7.3 and Figure 7.4 contain ∇²G zero-crossings at various scales near the embedded texture
boundary of Figure 7.2a and of Figure 7.2b, respectively. They were generated to show that for no
scale (operator size) is there a significant difference between the local orientation distributions of
zero-crossings for Figure 7.3 and Figure 7.4 that would result in a noticeable difference between the
detectability of the embedded region in Figure 7.2a and Figure 7.2b. At the smaller scales, the
zero-crossings where the line segments of different orientations cross are very similar for Figure 7.2a
and Figure 7.2b, and since the local amount of contour at each orientation is the same in both
figures by design, the local zero-crossing distributions of the two figures at these smaller scales are
quite similar. At the larger scales, the smaller line segments are not resolved; since the smaller line
segments carry the orientation change that produces the texture boundary, differences in the local
zero-crossing distributions of the two figures at larger scales are not relevant to the detectability of
the texture boundary. Thus, if texture boundary detection were based on identifying significant
changes in the distribution of zero-crossings at the boundary, the texture boundaries in Figure 7.2a
and Figure 7.2b should have similar detectability. Note, however, that in Figure 7.2a, the
embedded letter is easier to recognize than in Figure 7.2b, an effect that is accentuated at distant or
oblique viewpoints. This suggests that the line segments are somehow first separated on the basis
of their length.
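The contour bookkeeping behind Figure 7.2 is simple enough to verify directly. In the sketch below, the total contour per unit area at each orientation is taken to be segment length times segment density; the unit length and the density value are hypothetical, and exact fractions are used only to avoid rounding noise. Only the region outside the embedded letter is tabulated; the inside differs solely in using 65° instead of 25° for the short segments.

```python
from fractions import Fraction

def contour_by_orientation(elements):
    """Total contour length per unit area at each orientation.

    elements is a list of (orientation_deg, length, density) triples,
    with density in elements per unit area.
    """
    totals = {}
    for orientation, length, density in elements:
        totals[orientation] = totals.get(orientation, 0) + length * density
    return totals

L = Fraction(1)        # hypothetical length of the shorter segments
d = Fraction(1, 10)    # hypothetical density of the shorter segments

# Pattern (a): masking segments nine times as long at one ninth the density.
pattern_a = [(25, L, d), (45, 9 * L, d / 9)]
# Pattern (b): masking segments with the same length and density as the short ones.
pattern_b = [(25, L, d), (45, L, d)]

print(contour_by_orientation(pattern_a))
print(contour_by_orientation(pattern_b))
```

Both patterns come out with identical contour per unit area at 25° and at 45°, so a purely local tally of contour by orientation cannot tell them apart, which is the premise of the argument above.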

Figure 7.3 Zero-crossings for the texture change in Figure 7.2a using ∇²G operators of various sizes.
Again, the leftmost figure of each row depicts the image used to produce the zero-crossings in that row,
and the number adjacent to each figure gives the diameter of the excitatory region of the ∇²G operator
used to produce the zero-crossings in that figure, where the shorter line segments are 9 units long.
Comparison with Figure 7.4 reveals that at the smaller scales, there is no significant difference in the local
orientation distribution of the zero-crossings between the two figures, while at the larger scales the smaller
line segments, which contain the boundary-forming orientation change, are not resolved. Thus, the
results in Figure 7.2 cannot be explained if the texture boundary is detected solely on the basis of
significant changes in the local zero-crossing distribution across the boundary.

Figure 7.4 Zero-crossings for the texture change in Figure 7.2b using ∇²G operators of various sizes,
with the same format as Figure 7.3.

The result in Figure 7.2 may seem at odds with those due to Treisman [1977, 1980]. She found, using a
variety of techniques, that human observers were very poor at the pre-attentive selection of items having
the conjunction of two or more attribute values (e.g. a particular shape and a particular color) in a field
of distractors. In Figure 7.2, the selected attributes are orientation and scale (length of line segment). A
possible explanation is that scale is indeed special as suggested earlier -- large differences in size may not
be treated like other variations in attribute values, since they strongly suggest that different processes
are responsible for the respective items.

8. Summary of Demonstrations

(1)  Two  different  experimental  paradigms  --  one  based  on  static  shape  recognition  of  a  textured 
region  embedded  in  a  textured  surround  and  one  based  on  motion  correspondence  of  texture 
boundaries  -  support  the  hypothesis  that  some  kinds  of  texture  boundaries  are  detected  by  the 
visual  system  and  are  made  explicit  in  a  representation  that  covers  a  large  range  in  an  image. 

(2) ∇²G zero-crossing results indicate that there are no significant intensity changes at any scale
coincident with the texture boundaries in the above figures and thus the detection of these
boundaries must be based on more abstract texture measures.

(3)  Two  rules  characterize  many  of  the  texture  changes  that  can  and  cannot  produce  perceived 
texture  edges  as  evidenced  by  the  experimental  paradigms  in  (1): 

(a)  Significant,  abrupt  changes  in  texture  element  attributes  that  vary  with  changing  surface  geometry 
--  orientation,  length,  density,  width  --  produce  perceived  texture  edges. 

(b) The comparison of distributions of a given attribute of otherwise similar texture elements is kept
simple -- e.g. two fixed orientations are sufficient to match random orientations in the texture
boundary paradigms.

(4) Two different experimental paradigms -- one using oriented groupings of 2x2 checkerboards and one
using line segments of two different lengths -- combined with ∇²G zero-crossing results cast doubt that the
raw intensity changes at various scales would suffice as the sole texture tokens; there are no significant
changes in the distribution of the ∇²G zero-crossings at any scale at the texture boundaries found in these
demonstrations.